1 Background

The Penobscot River in Maine has been a focus of environmental concern for years. It is the largest watershed in New England and hosts the largest run of Atlantic salmon on the East Coast (Natural Resources Council of Maine, 2013; The Nature Conservancy, n.d.). Despite its ecological importance, the watershed has been significantly degraded by mercury contamination from a chemical plant. In 1967, a pharmaceutical company, Mallinckrodt, began discharging mercury into the river as part of its production process, leaving 13 tons of mercury in the river by 1970 (Island Institute, n.d.).

Mercury is a particularly worrisome and challenging contaminant because it does not easily break down. The contamination has had a significant and lasting impact on the surrounding ecosystem and nearby communities. The state of Maine issued an advisory warning pregnant women against eating certain species harvested from the river, and certain lobster and crab fisheries have been shut down (Maine Department of Environmental Protection, n.d.).

However, thanks to the persistent efforts of environmental advocates, this story has a silver lining. Senior attorney Nancy Marks of the Natural Resources Defense Council (NRDC) filed a lawsuit in 1998, and after two decades of litigation, a major legal breakthrough was achieved. In 2022, the long-running case was settled in federal district court in Maine, requiring Mallinckrodt to pay $197 million in remediation for the mercury damage (A 22-Year Court Battle Ends with Justice for the Penobscot River, 2022; Penobscot River Remediation, n.d.).

Because this watershed has been studied extensively to document the need for remediation, there is considerable data about its sediment history. NOAA has compiled datasets on sediment, soil, and tissue chemistry from various studies of the Penobscot watershed (Island Institute, n.d.). Using this data, we examine mercury levels in the sediment.

1.1 Research Questions and Hypotheses

Question 1: How do mercury contamination levels change over time in the sediments in the Penobscot River?

Question 2: Does mercury contamination increase with depth?

Hypothesis 1: After closure of the site, we expect mercury levels in surface sediment to decrease over time and with increasing distance from the site due to natural attenuation.

Hypothesis 2: Over time, mercury concentrations have decreased in the Penobscot River due to sedimentation in the river.

1.2 Dataset Information

The dataset was generated from NOAA’s DIVER (Data Integration, Visualization, Exploration and Reporting) system, which is part of the NOAA Damage Assessment, Remediation, and Restoration Program (DARRP). The data contains mercury results from sediment samples collected in the Penobscot River region in Maine from June 1st, 1970 to May 5th, 2021. Data is not available for every month in this range, likely due to shifting monitoring priorities or project-specific challenges.

Item                      Value
Data Source               NOAA DIVER
Date Range                June 1st, 1970 to May 5th, 2021
Number of Records         28,058
Records with Coordinates  24,501

2 Data Exploration

We began by importing the raw NOAA DIVER dataset and subsetting it to the columns relevant to our analysis: site location, depth, depth type, chemical type (mercury), and concentration. Next, we flagged all records containing coordinate information and created a new subset consisting only of samples with valid coordinates. This subset was then converted into a spatial dataset for mapping and spatial analysis.
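The coordinate-flagging step can be illustrated with a minimal sketch. It is written in Python for illustration and is not the code used to produce this report; the field names `latitude` and `longitude` and the sample rows are hypothetical and do not come from the NOAA DIVER export.

```python
from typing import Optional


def parse_coordinate(text: Optional[str], limit: float) -> Optional[float]:
    """Return the coordinate as a float, or None if it is missing or out of range."""
    if text is None or text.strip() == "":
        return None
    try:
        value = float(text)
    except ValueError:
        return None
    # NaN fails both comparisons, so it is also rejected here.
    return value if -limit <= value <= limit else None


def with_valid_coordinates(records):
    """Keep only records whose latitude and longitude both parse and fall in range."""
    kept = []
    for rec in records:
        lat = parse_coordinate(rec.get("latitude"), 90.0)
        lon = parse_coordinate(rec.get("longitude"), 180.0)
        if lat is not None and lon is not None:
            kept.append({**rec, "latitude": lat, "longitude": lon})
    return kept


# Hypothetical rows for illustration only (not taken from the dataset).
sample_records = [
    {"site": "A", "latitude": "44.73", "longitude": "-68.83"},
    {"site": "B", "latitude": "", "longitude": "-68.80"},
    {"site": "C", "latitude": "44.60", "longitude": None},
]
spatial_subset = with_valid_coordinates(sample_records)
```

Only site A survives the filter here, which mirrors how the 28,058 records reduce to the 24,501 with coordinates.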

After cleaning the dataset, we plotted sample depths against mercury concentrations, mapped the sampling locations onto the Penobscot River, and examined concentration trends over time. We plotted a reference line representing the probable effect concentration (PEC) for mercury (MacDonald et al., 2000), above which harmful effects to sediment-dwelling biota are likely to occur.
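As a small worked illustration of how the PEC reference line can be used, the sketch below compares concentrations against the 1.06 PPM threshold cited from MacDonald et al. (2000). It is a Python sketch for illustration only; the function names and the list of example concentrations are hypothetical.

```python
MERCURY_PEC_PPM = 1.06  # PEC for mercury cited from MacDonald et al. (2000)


def pec_ratio(concentration_ppm: float) -> float:
    """Express a concentration as a multiple of the PEC."""
    return concentration_ppm / MERCURY_PEC_PPM


def fraction_above_pec(concentrations_ppm) -> float:
    """Share of samples whose concentration exceeds the PEC."""
    values = list(concentrations_ppm)
    if not values:
        raise ValueError("at least one concentration is required")
    return sum(1 for c in values if c > MERCURY_PEC_PPM) / len(values)


# Hypothetical concentrations for illustration only (not taken from the dataset).
illustrative_ppm = [0.2, 0.9, 1.5, 3.0]
share_above = fraction_above_pec(illustrative_ppm)
```

Applied to the maximum of about 108.63 PPM reported in the spatial analysis, `pec_ratio` gives roughly 102, consistent with the statement that the highest values are about 100 times the PEC.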

Figure 2.1: Concentrations of mercury in the Penobscot River by depth.

Figure 2.2: Concentrations of mercury in the Penobscot River over time. The red line represents the mercury PEC, which is 1.06 PPM (MacDonald et al., 2000).

A preliminary review revealed a major data gap between 1970 and 2000, leaving insufficient information to conduct a long-term time-series analysis or to evaluate the direct effects of the 1967 contamination event. This gap likely reflects broader historical limitations in environmental monitoring and regulatory oversight during that period. To ensure analytical consistency, we restricted our final dataset to samples collected from 2000 to 2021.

3 Data Wrangling

Our initial exploration showed that some concentration results were coded as -9, NOAA’s indicator for missing data, so these records were excluded. We also identified two measurement units, PPM and mg/L. Because mg/L is not appropriate for solids and appeared in only one record, that entry was removed. Additionally, some samples had missing or inconsistent depth information, so we generated an average depth value for each sample for use in further analysis.
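These cleaning rules can be sketched as follows. This is a Python illustration, not the code behind this report. The field names (`result`, `unit`, `upper_depth_ft`, `lower_depth_ft`) are hypothetical, and taking the midpoint of the upper and lower depths is an assumption about how an average depth could be derived; the source does not state the exact rule.

```python
MISSING_SENTINEL = -9.0  # value the text describes as NOAA's missing-data code


def average_depth(upper, lower):
    """Midpoint of the two depths, or whichever one is present (assumed rule)."""
    if upper is not None and lower is not None:
        return (upper + lower) / 2.0
    return upper if upper is not None else lower


def clean_records(records):
    """Drop missing results and non-PPM units, then attach an average depth."""
    cleaned = []
    for rec in records:
        if rec["result"] == MISSING_SENTINEL:
            continue  # missing concentration
        if rec["unit"].strip().upper() != "PPM":
            continue  # e.g. the single mg/L record
        depth = average_depth(rec.get("upper_depth_ft"), rec.get("lower_depth_ft"))
        cleaned.append({**rec, "avg_depth_ft": depth})
    return cleaned


# Hypothetical rows for illustration only (not taken from the dataset).
raw = [
    {"result": 0.8, "unit": "PPM", "upper_depth_ft": 0.0, "lower_depth_ft": 0.5},
    {"result": -9.0, "unit": "PPM", "upper_depth_ft": 0.0, "lower_depth_ft": 0.5},
    {"result": 1.2, "unit": "mg/L", "upper_depth_ft": 0.0, "lower_depth_ft": 0.5},
    {"result": 2.1, "unit": "ppm", "upper_depth_ft": 1.0, "lower_depth_ft": None},
]
data_clean_sketch = clean_records(raw)
```

In this example the sentinel row and the mg/L row are dropped, and the row with only an upper depth keeps that depth as its average.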

4 Spatial Analysis

The analysis begins by orienting the reader to the Penobscot River in Maine and the distribution of mercury concentrations along it. The maps were created from the river shapefiles and the spatial subset produced during the initial data wrangling. The figures show that the highest concentrations are centered around the historical site itself.

The first map shows mercury concentration, in parts per million (PPM), as a color gradient along the river. The highest value is about 108.63 PPM, roughly 100 times the PEC.

The second map provides a closer look at samples around the contamination site. It shows the high concentrations found immediately adjacent to the site.

5 Data Analysis

5.1 Multiple Linear Regression

We ran a multiple linear regression to determine whether either variable, time or depth, was associated with mercury concentration in the river. We used a significance level of 0.05: a p-value greater than 0.05 implies no statistically significant relationship, while a p-value less than 0.05 implies a statistically significant relationship.

## 
## Call:
## lm(formula = Max_Result_.Raw. ~ Avg_Depth_ft + Date, data = data_clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
##  -1.322  -0.695  -0.247   0.184 107.775 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   1.083e+00  2.969e-01   3.649 0.000265 ***
## Avg_Depth_ft -4.542e-02  1.811e-02  -2.508 0.012162 *  
## Date         -1.030e-05  1.849e-05  -0.557 0.577329    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.425 on 7288 degrees of freedom
## Multiple R-squared:  0.0009819,  Adjusted R-squared:  0.0007078 
## F-statistic: 3.582 on 2 and 7288 DF,  p-value: 0.02788

The results indicate that at least one predictor is related to concentration, but the effect is very small. The overall p-value of 0.02788 is below 0.05, which means that time, depth, or both have some linear association with concentration. However, the R-squared value is less than 0.01, indicating that the model explains almost none of the variation in concentration, so the relationship is extremely weak. To better understand which variable is driving this, we ran separate linear regressions for time and depth to determine which one, if either, is associated with concentration.
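The combination of a significant p-value and a near-zero R-squared follows from the sample size. The overall F-statistic of a linear model can be recovered from R-squared, the number of predictors k, and the residual degrees of freedom as F = (R²/k) / ((1 − R²)/df). The sketch below, in Python for illustration (the function name is hypothetical), applies that identity to the values printed in the model summary above.

```python
def f_statistic(r_squared: float, n_predictors: int, residual_df: int) -> float:
    """Overall F-statistic of a linear model computed from its R-squared."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    explained = r_squared / n_predictors
    unexplained = (1.0 - r_squared) / residual_df
    return explained / unexplained


# Values copied from the multiple regression summary:
# R-squared 0.0009819, 2 predictors, 7288 residual degrees of freedom.
f_multiple = f_statistic(0.0009819, 2, 7288)
```

The result is about 3.58, matching the reported F-statistic of 3.582. Because the residual degrees of freedom multiply the ratio, even an R-squared below 0.001 yields an F large enough to fall under the 0.05 threshold when there are more than 7,000 observations.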

5.2 Temporal Relationship

This analysis starts by aggregating the data into a month column to prepare it for time-series analysis. Data was not collected regularly, possibly due to logistical and budget constraints, so some months have no samples.

We first plotted mercury concentrations as a function of time. We see an unexpected peak in 2021.
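A minimal sketch of the monthly aggregation, and of how gaps in the monthly record can be listed, is given below. It is written in Python for illustration and is not the code used for this report; the function names and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def monthly_means(samples):
    """Average concentration per (year, month) from (date, concentration) pairs."""
    buckets = defaultdict(list)
    for sample_date, concentration in samples:
        buckets[(sample_date.year, sample_date.month)].append(concentration)
    return {key: mean(values) for key, values in sorted(buckets.items())}


def missing_months(monthly, start, end):
    """List every (year, month) from start to end inclusive that has no data."""
    gaps = []
    year, month = start
    while (year, month) <= end:
        if (year, month) not in monthly:
            gaps.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return gaps


# Hypothetical samples for illustration only (not taken from the dataset).
samples = [
    (date(2017, 6, 3), 1.0),
    (date(2017, 6, 20), 3.0),
    (date(2017, 8, 1), 0.5),
]
monthly = monthly_means(samples)
gaps = missing_months(monthly, (2017, 6), (2017, 8))
```

Here June 2017 averages to 2.0 PPM and July 2017 is reported as a gap, which is the kind of irregular sampling described above.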

Figure 5.1: Mean monthly concentration of mercury over time.

Next, we ran a simple linear regression to test whether the positive trend line was statistically significant.

## 
## Call:
## lm(formula = Mean_Conc ~ Date, data = MonthlyData_final)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.2349 -0.5150 -0.1971  0.0655  8.0094 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.0284637  1.6243739  -0.633    0.530
## Date         0.0001307  0.0001061   1.232    0.224
## 
## Residual standard error: 1.298 on 49 degrees of freedom
## Multiple R-squared:  0.03006,    Adjusted R-squared:  0.01026 
## F-statistic: 1.518 on 1 and 49 DF,  p-value: 0.2237

The resulting p-value is 0.2237, which is greater than 0.05. Therefore, we found no statistically significant linear relationship between time and mean monthly concentration.

Because no linear trend was detected, we ran the seasonal Mann-Kendall test to determine whether a monotonic trend exists once seasonal patterns are taken into account.
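For readers who want to see what the simple regression computes, the sketch below fits a least-squares line and reports its R-squared. It is a Python illustration, not the code used for this report, and the function name is hypothetical. It also converts the reported slope to a per-year figure, assuming the `Date` predictor is measured in days, which the magnitude of the coefficient suggests but the source does not state.

```python
def simple_ols(x, y):
    """Least-squares slope, intercept, and R-squared for one predictor."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length of at least 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x must not be constant")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


# Exactly linear toy data for illustration only: y = 1 + 2x.
slope, intercept, r_squared = simple_ols([0, 1, 2, 3], [1, 3, 5, 7])

# Reported slope per unit of Date, converted under the days assumption.
DAYS_PER_YEAR = 365.25
slope_per_year = 0.0001307 * DAYS_PER_YEAR
```

Under that assumption the fitted trend corresponds to roughly 0.048 PPM per year, a small increase that the p-value of 0.2237 does not distinguish from zero.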

## Score =  -6 , Var(Score) = 128
## denominator =  84
## tau = -0.0714, 2-sided pvalue =0.59588

This test produced a p-value of 0.596, which is greater than 0.05, indicating that there is no statistically significant trend in the seasonal data. We then ran a non-seasonal Mann-Kendall trend test, which also produced a non-significant p-value of 0.7208.

## 
##  Mann-Kendall trend test
## 
## data:  MonthlyMercury_noseason_ts
## z = -0.35738, n = 51, p-value = 0.7208
## alternative hypothesis: true S is not equal to 0
## sample estimates:
##             S          varS           tau 
## -4.500000e+01  1.515833e+04 -3.529412e-02

Together, these results show no statistically significant temporal trend in contamination levels in the Penobscot River from 2000 to 2021. Therefore, we must reject our first hypothesis.
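The non-seasonal Mann-Kendall test can be sketched in a few lines. The version below is a Python illustration, not the R routine that produced the output above; the function names are hypothetical. It uses the standard score S, the tie-corrected variance, a continuity correction of one unit, and a two-sided normal p-value.

```python
import math
from collections import Counter


def mk_score(values):
    """Mann-Kendall S: concordant minus discordant pairs in time order."""
    s = 0
    n = len(values)
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    return s


def mk_variance(values):
    """Variance of S with the usual correction for tied values."""
    n = len(values)
    total = n * (n - 1) * (2 * n + 5)
    for t in Counter(values).values():
        total -= t * (t - 1) * (2 * t + 5)
    return total / 18.0


def mk_z_and_p(score, variance):
    """Continuity-corrected z statistic and two-sided normal p-value."""
    if score > 0:
        z = (score - 1) / math.sqrt(variance)
    elif score < 0:
        z = (score + 1) / math.sqrt(variance)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# S and varS copied from the non-seasonal test output above.
z, p_value = mk_z_and_p(-45, 15158.33)
```

With S = -45 and varS = 15158.33, this gives z of about -0.357 and a p-value of about 0.721, in line with the reported values. The variance also matches the no-ties formula n(n-1)(2n+5)/18 for n = 51.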

5.3 Depth Relationship

How do mercury levels change with depth?

The following graph shows mercury concentrations from 1970 to 2021 by depth type (surface sediment or subsurface sediment). The 1970 samples are mostly surface sediment, which reflects their proximity to the initial pollution, whereas later decades include more subsurface sediment samples.

Figure 5.2: Plot of concentrations over time and their depth category. Surface depth refers to data in the top four inches, while subsurface data is any depth below that.

Figure 5.3: Mercury concentrations by depth.

Figure 5.4: Mercury concentrations by depth in the top two feet of benthic sediment.

## 
## Call:
## lm(formula = Max_Result_.Raw. ~ Avg_Depth_ft, data = DepthPlot.data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
##  -0.967  -0.674  -0.259   0.164 107.786 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   0.97227    0.03622  26.847  < 2e-16 ***
## Avg_Depth_ft -0.10488    0.02714  -3.864 0.000112 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.44 on 7181 degrees of freedom
## Multiple R-squared:  0.002075,   Adjusted R-squared:  0.001936 
## F-statistic: 14.93 on 1 and 7181 DF,  p-value: 0.0001124

In our initial analysis of Figure 5.2, we observed that subsurface sediment generally showed higher mercury concentrations over time. In Figure 5.3, we found that mercury concentrations were primarily aggregated within the top 0–2 feet of depth. Because this upper zone is the most biologically active, we created Figure 5.4 to isolate only the top two feet. Even after this refinement, no clear trend was visually apparent.

To further assess the potential relationship between mercury concentration and depth, we conducted a linear regression analysis. Although the data did not visually suggest a trend, the regression indicated a statistically significant linear relationship (p = 0.0001124, below 0.05), with an estimated slope of -0.105 PPM per foot of depth. However, the adjusted R-squared value of 0.001936 showed that the relationship is extremely weak.

Therefore, although a statistically detectable linear relationship exists, depth cannot be considered a meaningful predictor of mercury concentration because the effect size is negligible. We consider our results inconclusive, meaning we cannot confidently determine whether our hypothesis is true or false.
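The adjusted R-squared quoted for the depth model can be checked directly from the regression summary. The sketch below, in Python for illustration (the function name is hypothetical), uses the standard formula 1 − (1 − R²)(n − 1)/(n − k − 1), where the residual degrees of freedom equal n − k − 1.

```python
def adjusted_r_squared(r_squared: float, n_predictors: int, residual_df: int) -> float:
    """Adjusted R-squared from R-squared, predictor count, and residual df."""
    n_minus_1 = residual_df + n_predictors  # since residual_df = n - k - 1
    return 1.0 - (1.0 - r_squared) * n_minus_1 / residual_df


# Values copied from the depth regression summary:
# R-squared 0.002075, 1 predictor, 7181 residual degrees of freedom.
adj_depth = adjusted_r_squared(0.002075, 1, 7181)
```

This reproduces the reported 0.001936. For scale, the fitted slope of -0.10488 PPM per foot implies a change of only about 0.21 PPM across the top two feet, which is small next to the residual standard error of 2.44.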

6 Conclusion

In conclusion, our temporal analysis led us to reject our first hypothesis that, after closure of the site, mercury levels in surface sediment would decrease over time and with increasing distance from the source due to natural attenuation. We acknowledge that this outcome may be influenced by the large data gap in the NOAA dataset, which makes it difficult to analyze the years immediately following contamination. Despite this limitation, our spatial analysis showed that the highest mercury concentrations remain closest to the original contamination site.

In our depth analysis, we could not confirm our second hypothesis. Although the regression produced a statistically significant p-value, the relationship was weak, limiting our confidence in the results. Incorporating additional data or more detailed sampling could potentially change these findings.

Next steps in this study would include continuing to collect and clean data, and possibly interpolating or otherwise manipulating the existing data points to create a more comprehensive temporal dataset. This would allow for a clearer understanding of long-term trends and improve the reliability of future analyses.

List of adjusted R-squared values and p-values from the simple linear regressions.

Variable  Adjusted R-squared  P-value
Time      0.01026             0.2237
Depth     0.001936            0.0001124

7 References

A 22-Year Court Battle Ends with Justice for the Penobscot River. (2022, October 11). https://www.nrdc.org/stories/22-year-court-battle-ends-justice-penobscot-river

Finally, a plan to remediate Penobscot River mercury—Island Institute. (n.d.). Retrieved December 10, 2025, from https://www.islandinstitute.org/working-waterfront/finally-a-plan-to-remediate-penobscot-river-mercury/

MacDonald, D. D., Ingersoll, C. G., & Berger, T. A. (2000). Development and Evaluation of Consensus-Based Sediment Quality Guidelines for Freshwater Ecosystems. Archives of Environmental Contamination and Toxicology, 39(1), 20–31. https://doi.org/10.1007/s002440010075

Mercury contamination in and along the Penobscot River, Bureau of Remediation and Waste Management, Maine Department of Environmental Protection. (n.d.). Retrieved December 10, 2025, from https://www.maine.gov/dep/spills/holtrachem/

Penobscot River Restoration Project. (2013, September 25). Natural Resources Council of Maine. https://www.nrcm.org/programs/waters/penobscot-river-restoration-project/

Restoring the Penobscot River. (n.d.). The Nature Conservancy. Retrieved December 10, 2025, from https://www.nature.org/en-us/about-us/where-we-work/united-states/maine/stories-in-maine/restoring-the-penobscot-river/

Settlement Details | Penobscot River Remediation. (n.d.). Penobscot RR. Retrieved December 10, 2025, from https://www.penobscotriverremediation.com/settlementdetails